clValid, an R package for cluster validation

Authors

  • Guy Brock
  • Vasyl Pihur
  • Susmita Datta
  • Somnath Datta
Abstract

The R package clValid contains functions for validating the results of a clustering analysis. Three main types of cluster validation measures are available: “internal”, “stability”, and “biological”. The user can choose from nine clustering algorithms in existing R packages, including hierarchical, K-means, self-organizing maps (SOM), and model-based clustering. In addition, we provide a function to perform the self-organizing tree algorithm (SOTA) method of clustering. Any combination of validation measures and clustering methods can be requested in a single function call. This allows the user to simultaneously evaluate several clustering algorithms while varying the number of clusters, to help determine the most appropriate method and number of clusters for the dataset of interest. Additionally, the...
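
To make the idea of an “internal” validation measure concrete, the following is a minimal Python sketch, not the clValid implementation. It assumes the Dunn index as the measure (the smallest between-cluster distance divided by the largest within-cluster diameter; higher is better) and scores two hand-written candidate partitions of a toy dataset. The function name `dunn_index`, the toy points, and the candidate labelings are all illustrative assumptions.

```python
from itertools import combinations
from math import dist


def dunn_index(points, labels):
    """Smallest between-cluster distance divided by largest cluster diameter."""
    # Group the points by their cluster label.
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("the Dunn index needs at least two clusters")

    # Largest distance between two points that share a cluster.
    max_diameter = max(
        (dist(a, b)
         for members in clusters.values()
         for a, b in combinations(members, 2)),
        default=0.0,
    )
    # Smallest distance between two points in different clusters.
    min_separation = min(
        dist(a, b)
        for first, second in combinations(clusters.values(), 2)
        for a in first
        for b in second
    )
    if max_diameter == 0.0:
        return float("inf")  # every cluster is a single point
    return min_separation / max_diameter


# Toy data (hypothetical): two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

# Candidate partitions with different numbers of clusters (hypothetical).
candidates = {
    "k=2": [0, 0, 0, 1, 1, 1],
    "k=3": [0, 0, 1, 2, 2, 2],
}

# Score every candidate and keep the one with the highest Dunn index.
scores = {name: dunn_index(points, labels) for name, labels in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

Scoring every candidate with the same measure and comparing the results mirrors, in miniature, the workflow the abstract describes: several clusterings and cluster counts evaluated side by side to help choose among them.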

Similar resources

optCluster: An R Package for Determining the Optimal Clustering Algorithm

There exist numerous programs and packages that perform validation for a given clustering solution; however, clustering algorithms fare differently as judged by different validation measures. If more than one performance measure is used to evaluate multiple clustering partitions, an optimal result is often difficult to determine by visual inspection alone. This paper introduces optCluster, an R...

An Examination of Indices for Determining the Number of Clusters: NbClust Package

May 23, 2012. Type: Package. Title: An examination of indices for determining the number of clusters: NbClust Package. Version: 1.0. Date: 2012-05-01. Author: Malika Charrad and Nadia Ghazzali and Veronique Boiteau and Azam Niknafs. Maintainer: Nadia Ghazzali <[email protected]...

An R package to process LC/MS metabolomic data: MAIT (Metabolite Automatic Identification Toolkit)

Processing metabolomic liquid chromatography and mass spectrometry (LC/MS) data files is time-consuming. Currently available R tools allow for only a limited number of processing steps, and online tools are hard to use in a programmable fashion. This paper introduces the metabolite automatic identification toolkit MAIT package, which allows users to perform end-to-end LC/MS metabolomic data analy...

R SDisc: Integrated methodology for data subtype discovery

Cluster analysis is a statistical technique that aims to partition observations into groups, such that similar items fall in the same cluster and differ markedly from items in other clusters. As a discovery tool, cluster analysis may help reveal associations, patterns, relationships, and structure in data. R SDisc is an additional tool to perform cluster analysis. However, instead of propo...

Development and Validation of Cognitive Behavioral Therapy Package Based on Detachment of Emotion and Goal for Adjunctive Psychotherapy of Individuals with Bipolar Disorder: A Descriptive Study

Background and Objectives: Bipolar disorder imposes substantial costs on the treatment system. Adjunctive psychotherapies are used in the treatment of bipolar disorder to reduce symptoms, prevent recurrence, and increase drug adherence. The aim of the current research was to develop and validate cognitive behavioral therapy based on detachment of emotion and goal for bipolar disorder. Materials and Methods...

Publication date: 2008